Title
Improving Abstractive Summarization by Training Masked Out-of-Vocabulary Words
Author
Tae-Seok Lee, Hyun-Young Lee, Seung-Shik Kang
Citation
Vol. 18, No. 3, pp. 344-358 (June 2022)
English Abstract
Text summarization is the task of producing a shorter version of a long document while accurately preserving its main content. Abstractive summarization generates novel words and phrases with a language generation method, drawing on text transformation and pre-embedded word information. However, newly coined or out-of-vocabulary words degrade the performance of automatic summarization because they are not pre-trained in the machine learning process. In this study, we demonstrated an improvement in summarization quality through the contextualized embedding of BERT with out-of-vocabulary masking. In addition, by explicitly providing precise pointing and an optional copy instruction along with the BERT embedding, we achieved higher accuracy than the baseline model. The recall-based word-generation metric ROUGE-1 score was 55.11, and the word-order-based ROUGE-L score was 39.65.
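
The out-of-vocabulary masking described in the abstract can be illustrated with a short sketch. This is a minimal, hypothetical example assuming the HuggingFace transformers library; the checkpoint name "bert-base-multilingual-cased" and the function name are illustrative stand-ins, not the paper's actual model. The idea is to replace every [UNK] position with [MASK] so that BERT infers a context-dependent vector for the unknown word instead of returning the single shared [UNK] embedding.

```python
import torch
from transformers import BertModel, BertTokenizer

# Hypothetical checkpoint; the paper's actual model is not specified here.
tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def contextualized_oov_embeddings(text: str) -> torch.Tensor:
    """Return BERT hidden states with OOV positions masked out."""
    # WordPiece falls back to [UNK] for tokens it cannot cover.
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"]
    # Replace each [UNK] with [MASK] so BERT predicts a contextual
    # vector at that position rather than the generic [UNK] embedding.
    oov_positions = input_ids == tokenizer.unk_token_id
    masked_ids = input_ids.masked_fill(oov_positions, tokenizer.mask_token_id)
    with torch.no_grad():
        out = model(input_ids=masked_ids, attention_mask=enc["attention_mask"])
    return out.last_hidden_state  # shape: (1, seq_len, hidden_size)
```

The masked-position vectors then stand in for the OOV words' representations when the summarizer encodes the source document.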
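
The abstract's "precise pointing and an optional copy instruction" points to a copy mechanism over the source text. The sketch below shows the standard pointer-generator mixture (in the style of See et al.), not the paper's exact Selective OOV Copy Model; all function names and tensor shapes are assumptions for illustration. Copying lets the decoder emit a source-only OOV word verbatim by routing attention mass onto an extended vocabulary id.

```python
import torch

def mix_copy_distribution(vocab_logits, attn_weights, src_ids, p_gen, n_src_oov):
    """Combine generation and copying (pointer-generator style sketch).

    vocab_logits: (B, V)  decoder scores over the fixed vocabulary
    attn_weights: (B, S)  attention over source positions (rows sum to 1)
    src_ids:      (B, S)  source token ids; source-only OOV words get
                          extended ids in [V, V + n_src_oov)
    p_gen:        (B, 1)  probability of generating rather than copying
    """
    # Generation branch: vocabulary distribution weighted by p_gen.
    p_vocab = torch.softmax(vocab_logits, dim=-1) * p_gen
    # Extend with zero columns so extended OOV ids have somewhere to land.
    extra = torch.zeros(p_vocab.size(0), n_src_oov, device=p_vocab.device)
    extended = torch.cat([p_vocab, extra], dim=1)  # (B, V + n_src_oov)
    # Copy branch: add each source position's attention mass onto the id
    # of the token there, so an OOV word can be copied verbatim.
    return extended.scatter_add(1, src_ids, attn_weights * (1.0 - p_gen))
```

Under this mixture, a word absent from the vocabulary can still receive probability mass through its extended id whenever the attention points at it in the source.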
Keywords
BERT, Deep Learning, Generative Summarization, Selective OOV Copy Model, Unknown Words